H-1B visas are a category of employment-based, non-immigrant visas for temporary foreign workers in the United States. For a foreign national to apply for an H-1B visa, a US employer must offer them a job and submit an H-1B petition to US Citizenship and Immigration Services (USCIS). It is also the most common visa status applied for and held by international students once they complete college or higher education and begin working in a full-time position.
This dataset contains five years' worth of H-1B petition data, with approximately 3 million records overall. The columns in the dataset include case status, employer name, worksite coordinates, job title, prevailing wage, occupation code, and year filed.
The objective of this project is to analyse and gain further insight into the H-1B applications filed in 2016 in the United States of America.
The data set of three million rows is filtered down to seventy thousand rows to suit the project requirements. After filtering, it consists of eleven attributes for the year 2016.
The next major step in data preprocessing is handling null and N/A values and outliers. All rows in which a class attribute is N/A or null are removed, and outliers are dealt with.
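As an illustration of the year-based part of this filtering, here is a minimal Python sketch using only the standard library. The column names (`case_status`, `year`, and so on), the sample rows, and the helper `filter_by_year` are hypothetical and are not taken from the actual dataset; the report does not state which other criteria were used to reach seventy thousand rows.

```python
import csv
import io

# Hypothetical sample data: column names and values are illustrative only.
SAMPLE = """case_status,employer_name,job_title,prevailing_wage,year
CERTIFIED,Acme Corp,Data Analyst,68000,2016
DENIED,Globex,Software Engineer,91000,2015
CERTIFIED,Initech,Business Analyst,57000,2016
"""


def filter_by_year(rows, year):
    """Keep only the records whose 'year' field equals the requested year."""
    return [row for row in rows if row["year"] == str(year)]


# csv.DictReader yields one dict per row, keyed by the header line.
records = list(csv.DictReader(io.StringIO(SAMPLE)))
records_2016 = filter_by_year(records, 2016)
print(len(records_2016), "of", len(records), "records are from 2016")
```

On the real file, the same function would be applied to rows read from disk instead of the in-memory sample.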
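The report does not say how the outliers were handled, so the sketch below is only one common option, not the method used in this project: drop rows with missing values, then discard wages outside 1.5 times the interquartile range (IQR). The field name `prevailing_wage`, the set of missing-value markers, and the sample wages are assumptions made for illustration.

```python
import statistics

# Markers treated as missing; this set is an assumption, adjust to the real data.
MISSING = {"", "NA", "N/A", "null", None}


def drop_missing(rows, fields):
    """Remove rows where any of the given fields holds a missing marker."""
    return [r for r in rows if all(r.get(f) not in MISSING for f in fields)]


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences at k * IQR beyond the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(rows, field, k=1.5):
    """Keep rows whose numeric field lies inside the IQR fences."""
    low, high = iqr_bounds([float(r[field]) for r in rows], k)
    return [r for r in rows if low <= float(r[field]) <= high]


# Hypothetical wages, including one missing value and two extreme values.
raw_wages = ["35", "50000", "55000", "57000", "60000", "65000",
             "68000", "72000", "80000", "85000", "329100000", "N/A"]
rows = [{"prevailing_wage": w} for w in raw_wages]

clean = drop_missing(rows, ["prevailing_wage"])
trimmed = remove_outliers(clean, "prevailing_wage")
print(len(rows), len(clean), len(trimmed))
```

The IQR rule is a reasonable default for skewed salary data because it does not depend on the mean, which the extreme values themselves distort.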
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max.
##        35     57320     68240     89150     85180 329100000
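In the summary, the mean (89150) is higher than the third quartile (85180), which is what happens when a few very large values such as the maximum pull the average upward. The sketch below reproduces that effect on a small hypothetical sample; the helper `summarize` and the wage values are illustrative, and `method="inclusive"` is chosen because it should correspond to the default quantile definition behind this kind of summary output.

```python
import statistics


def summarize(values):
    """Return min, quartiles, mean, and max for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "mean": statistics.mean(values),
        "q3": q3,
        "max": max(values),
    }


# Hypothetical wages: five ordinary salaries and one extreme value.
wages = [40000, 50000, 60000, 70000, 80000, 10000000]
summary = summarize(wages)

for name, value in summary.items():
    print(f"{name:>6}: {value:,.0f}")

# A mean above the third quartile signals a heavy right tail.
print("mean exceeds q3:", summary["mean"] > summary["q3"])
```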
Below are a box plot and a histogram of the prevailing wage (salary) distribution for each group of applicants.